import numpy as np
import pandas as pd
from scipy.stats import norm
kakamana
January 18, 2023
We will walk you through the steps of performing a one-sample proportion test so that you can better understand how hypothesis tests work and what problems they can solve. Along the way, we will also introduce important concepts such as z-scores, p-values, and false negative and false positive errors.
This Introduction to Hypothesis Testing is part of the Datacamp course: Hypothesis Testing in Python
This is my learning experience of data science through DataCamp
A/B testing (also known as split testing): a randomized experiment that compares an outcome between a treatment group and a control group
Hypothesis: a theory or assumption that has yet to be proved
Point estimate: a sample statistic, such as a sample mean, used to estimate a population parameter, e.g. mean_samp = population['column'].mean()
Standard error: the standard deviation of a sample statistic; the standard deviation of a bootstrap distribution estimates it, e.g. std_error = np.std(so_boot_distn, ddof=1)
Since variables have different units and ranges, we standardize their values before testing a hypothesis:
standardized value = (value - mean) / standard deviation
z = (sample statistic - hypothesized parameter value) / standard error
Standard normal distribution: a normal distribution with mean = 0 and standard deviation = 1
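The walkthrough below uses a sample of 1,000 shipments from the course's late-shipments dataset. A minimal loading sketch, assuming the data is stored locally as a feather file (the file name is an assumption):
import pandas as pd

# Load the late shipments dataset (file name and format are assumptions)
late_shipments = pd.read_feather("late_shipments.feather")

# Preview the first few rows
late_shipments.head()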
id | country | managed_by | fulfill_via | vendor_inco_term | shipment_mode | late_delivery | late | product_group | sub_classification | ... | line_item_quantity | line_item_value | pack_price | unit_price | manufacturing_site | first_line_designation | weight_kilograms | freight_cost_usd | freight_cost_groups | line_item_insurance_usd | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 36203.0 | Nigeria | PMO - US | Direct Drop | EXW | Air | 1.0 | Yes | HRDT | HIV test | ... | 2996.0 | 266644.00 | 89.00 | 0.89 | Alere Medical Co., Ltd. | Yes | 1426.0 | 33279.83 | expensive | 373.83 |
1 | 30998.0 | Botswana | PMO - US | Direct Drop | EXW | Air | 0.0 | No | HRDT | HIV test | ... | 25.0 | 800.00 | 32.00 | 1.60 | Trinity Biotech, Plc | Yes | 10.0 | 559.89 | reasonable | 1.72 |
2 | 69871.0 | Vietnam | PMO - US | Direct Drop | EXW | Air | 0.0 | No | ARV | Adult | ... | 22925.0 | 110040.00 | 4.80 | 0.08 | Hetero Unit III Hyderabad IN | Yes | 3723.0 | 19056.13 | expensive | 181.57 |
3 | 17648.0 | South Africa | PMO - US | Direct Drop | DDP | Ocean | 0.0 | No | ARV | Adult | ... | 152535.0 | 361507.95 | 2.37 | 0.04 | Aurobindo Unit III, India | Yes | 7698.0 | 11372.23 | expensive | 779.41 |
4 | 5647.0 | Uganda | PMO - US | Direct Drop | EXW | Air | 0.0 | No | HRDT | HIV test - Ancillary | ... | 850.0 | 8.50 | 0.01 | 0.00 | Inverness Japan | Yes | 56.0 | 360.00 | reasonable | 0.01 |
5 rows × 27 columns
We’ll begin our analysis by calculating a point estimate (or sample statistic), namely the proportion of late shipments.
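A sketch of that step, assuming the data is loaded as late_shipments (as above) and that a shipment counts as late when its late column equals "Yes":
# Inspect the full sample
print(late_shipments)

# Point estimate: the proportion of late shipments in the sample
late_prop_samp = (late_shipments['late'] == 'Yes').mean()
print(late_prop_samp)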
id country managed_by fulfill_via vendor_inco_term \
0 36203.0 Nigeria PMO - US Direct Drop EXW
1 30998.0 Botswana PMO - US Direct Drop EXW
2 69871.0 Vietnam PMO - US Direct Drop EXW
3 17648.0 South Africa PMO - US Direct Drop DDP
4 5647.0 Uganda PMO - US Direct Drop EXW
.. ... ... ... ... ...
995 13608.0 Uganda PMO - US Direct Drop DDP
996 80394.0 Congo, DRC PMO - US Direct Drop EXW
997 61675.0 Zambia PMO - US Direct Drop EXW
998 39182.0 South Africa PMO - US Direct Drop DDP
999 5645.0 Botswana PMO - US Direct Drop EXW
shipment_mode late_delivery late product_group sub_classification \
0 Air 1.0 Yes HRDT HIV test
1 Air 0.0 No HRDT HIV test
2 Air 0.0 No ARV Adult
3 Ocean 0.0 No ARV Adult
4 Air 0.0 No HRDT HIV test - Ancillary
.. ... ... ... ... ...
995 Air 0.0 No ARV Adult
996 Air 0.0 No HRDT HIV test
997 Air 1.0 Yes HRDT HIV test
998 Ocean 0.0 No ARV Adult
999 Air 0.0 No HRDT HIV test
... line_item_quantity line_item_value pack_price unit_price \
0 ... 2996.0 266644.00 89.00 0.89
1 ... 25.0 800.00 32.00 1.60
2 ... 22925.0 110040.00 4.80 0.08
3 ... 152535.0 361507.95 2.37 0.04
4 ... 850.0 8.50 0.01 0.00
.. ... ... ... ... ...
995 ... 121.0 9075.00 75.00 0.62
996 ... 292.0 9344.00 32.00 1.60
997 ... 2127.0 170160.00 80.00 0.80
998 ... 191011.0 861459.61 4.51 0.15
999 ... 200.0 14398.00 71.99 0.72
manufacturing_site first_line_designation weight_kilograms \
0 Alere Medical Co., Ltd. Yes 1426.0
1 Trinity Biotech, Plc Yes 10.0
2 Hetero Unit III Hyderabad IN Yes 3723.0
3 Aurobindo Unit III, India Yes 7698.0
4 Inverness Japan Yes 56.0
.. ... ... ...
995 Janssen-Cilag, Latina, IT Yes 43.0
996 Trinity Biotech, Plc Yes 99.0
997 Alere Medical Co., Ltd. Yes 881.0
998 Aurobindo Unit III, India Yes 16234.0
999 Inverness Japan Yes 46.0
freight_cost_usd freight_cost_groups line_item_insurance_usd
0 33279.83 expensive 373.83
1 559.89 reasonable 1.72
2 19056.13 expensive 181.57
3 11372.23 expensive 779.41
4 360.00 reasonable 0.01
.. ... ... ...
995 199.00 reasonable 12.72
996 2162.55 reasonable 13.10
997 14019.38 expensive 210.49
998 14439.17 expensive 1421.41
999 1028.18 reasonable 23.04
[1000 rows x 27 columns]
0.061
The proportion of late shipments in the sample is 0.061, or 6.1%
late_shipments_boot_distn=[0.064,
0.049,
0.06,
0.066,
0.052,
0.066,
0.071,
0.061,
0.051,
0.06,
0.053,
0.066,
0.069,
0.068,
0.063,
0.061,
0.052,
0.045,
0.054,
0.054,
0.064,
0.064,
0.058,
0.062,
0.05,
0.053,
0.064,
0.058,
0.071,
0.064,
0.052,
0.063,
0.056,
0.05,
0.058,
0.06,
0.068,
0.065,
0.056,
0.052,
0.061,
0.059,
0.054,
0.071,
0.067,
0.079,
0.069,
0.069,
0.05,
0.059,
0.062,
0.046,
0.068,
0.057,
0.067,
0.042,
0.074,
0.063,
0.056,
0.063,
0.068,
0.06,
0.068,
0.064,
0.052,
0.045,
0.058,
0.072,
0.078,
0.055,
0.069,
0.048,
0.047,
0.061,
0.066,
0.062,
0.059,
0.062,
0.054,
0.063,
0.061,
0.059,
0.057,
0.059,
0.058,
0.068,
0.067,
0.059,
0.054,
0.064,
0.047,
0.054,
0.065,
0.063,
0.057,
0.062,
0.058,
0.046,
0.052,
0.065,
0.053,
0.069,
0.068,
0.065,
0.052,
0.061,
0.058,
0.042,
0.064,
0.063,
0.068,
0.067,
0.061,
0.056,
0.061,
0.044,
0.058,
0.051,
0.075,
0.064,
0.073,
0.058,
0.056,
0.055,
0.063,
0.056,
0.067,
0.075,
0.061,
0.063,
0.051,
0.065,
0.069,
0.066,
0.05,
0.066,
0.057,
0.064,
0.065,
0.062,
0.071,
0.062,
0.065,
0.062,
0.066,
0.071,
0.058,
0.053,
0.062,
0.051,
0.056,
0.061,
0.074,
0.054,
0.059,
0.069,
0.073,
0.066,
0.052,
0.065,
0.072,
0.071,
0.059,
0.065,
0.06,
0.055,
0.053,
0.059,
0.066,
0.061,
0.053,
0.053,
0.06,
0.058,
0.074,
0.05,
0.059,
0.067,
0.06,
0.064,
0.061,
0.072,
0.06,
0.048,
0.066,
0.059,
0.08,
0.062,
0.066,
0.065,
0.06,
0.048,
0.064,
0.07,
0.053,
0.035,
0.071,
0.061,
0.051,
0.052,
0.051,
0.069,
0.052,
0.052,
0.065,
0.053,
0.055,
0.063,
0.066,
0.062,
0.067,
0.079,
0.062,
0.056,
0.058,
0.068,
0.062,
0.045,
0.063,
0.069,
0.054,
0.065,
0.061,
0.057,
0.05,
0.048,
0.069,
0.058,
0.052,
0.056,
0.057,
0.071,
0.059,
0.062,
0.064,
0.053,
0.065,
0.056,
0.06,
0.062,
0.042,
0.054,
0.051,
0.061,
0.049,
0.071,
0.072,
0.059,
0.063,
0.049,
0.074,
0.063,
0.052,
0.055,
0.072,
0.054,
0.067,
0.067,
0.067,
0.055,
0.073,
0.064,
0.069,
0.06,
0.053,
0.057,
0.056,
0.058,
0.067,
0.065,
0.064,
0.053,
0.055,
0.069,
0.058,
0.07,
0.068,
0.062,
0.062,
0.05,
0.069,
0.061,
0.057,
0.066,
0.056,
0.053,
0.055,
0.062,
0.064,
0.055,
0.056,
0.061,
0.058,
0.068,
0.079,
0.057,
0.049,
0.052,
0.063,
0.064,
0.059,
0.071,
0.064,
0.052,
0.066,
0.063,
0.069,
0.056,
0.057,
0.062,
0.057,
0.055,
0.062,
0.06,
0.064,
0.057,
0.062,
0.069,
0.067,
0.052,
0.061,
0.056,
0.055,
0.056,
0.055,
0.064,
0.068,
0.051,
0.054,
0.057,
0.054,
0.07,
0.049,
0.058,
0.063,
0.07,
0.046,
0.059,
0.064,
0.059,
0.061,
0.066,
0.06,
0.073,
0.08,
0.069,
0.061,
0.071,
0.068,
0.065,
0.063,
0.054,
0.07,
0.061,
0.053,
0.059,
0.047,
0.064,
0.071,
0.068,
0.049,
0.063,
0.057,
0.057,
0.059,
0.061,
0.048,
0.084,
0.07,
0.077,
0.043,
0.065,
0.057,
0.057,
0.054,
0.064,
0.062,
0.067,
0.068,
0.06,
0.054,
0.066,
0.048,
0.048,
0.06,
0.054,
0.067,
0.064,
0.064,
0.067,
0.058,
0.066,
0.06,
0.048,
0.058,
0.054,
0.056,
0.055,
0.068,
0.077,
0.06,
0.061,
0.055,
0.065,
0.064,
0.058,
0.058,
0.058,
0.055,
0.067,
0.061,
0.063,
0.065,
0.071,
0.051,
0.066,
0.066,
0.066,
0.07,
0.068,
0.061,
0.062,
0.054,
0.058,
0.066,
0.059,
0.061,
0.058,
0.057,
0.065,
0.053,
0.053,
0.06,
0.068,
0.067,
0.068,
0.061,
0.067,
0.059,
0.057,
0.055,
0.067,
0.058,
0.055,
0.055,
0.054,
0.061,
0.074,
0.071,
0.057,
0.056,
0.047,
0.07,
0.054,
0.052,
0.072,
0.054,
0.064,
0.063,
0.075,
0.064,
0.051,
0.061,
0.064,
0.047,
0.067,
0.061,
0.06,
0.057,
0.059,
0.058,
0.07,
0.06,
0.056,
0.064,
0.056,
0.066,
0.051,
0.064,
0.054,
0.058,
0.064,
0.041,
0.057,
0.055,
0.06,
0.06,
0.051,
0.054,
0.07,
0.053,
0.063,
0.058,
0.066,
0.059,
0.051,
0.067,
0.078,
0.056,
0.068,
0.057,
0.059,
0.062,
0.053,
0.064,
0.067,
0.068,
0.071,
0.066,
0.057,
0.063,
0.067,
0.059,
0.057,
0.064,
0.049,
0.066,
0.055,
0.071,
0.061,
0.078,
0.062,
0.052,
0.058,
0.066,
0.06,
0.054,
0.058,
0.054,
0.062,
0.072,
0.068,
0.057,
0.059,
0.066,
0.066,
0.065,
0.067,
0.071,
0.064,
0.072,
0.067,
0.064,
0.064,
0.051,
0.061,
0.047,
0.07,
0.073,
0.06,
0.066,
0.058,
0.056,
0.064,
0.059,
0.062,
0.046,
0.07,
0.07,
0.071,
0.056,
0.061,
0.066,
0.058,
0.055,
0.073,
0.068,
0.073,
0.055,
0.074,
0.063,
0.049,
0.063,
0.063,
0.056,
0.061,
0.065,
0.066,
0.06,
0.057,
0.07,
0.06,
0.053,
0.055,
0.066,
0.07,
0.069,
0.051,
0.067,
0.055,
0.06,
0.074,
0.06,
0.057,
0.06,
0.054,
0.054,
0.058,
0.06,
0.057,
0.059,
0.065,
0.061,
0.073,
0.067,
0.063,
0.079,
0.063,
0.063,
0.051,
0.074,
0.06,
0.07,
0.063,
0.072,
0.066,
0.058,
0.046,
0.059,
0.064,
0.058,
0.071,
0.055,
0.062,
0.05,
0.055,
0.061,
0.052,
0.059,
0.063,
0.058,
0.044,
0.052,
0.069,
0.056,
0.057,
0.064,
0.067,
0.058,
0.07,
0.065,
0.068,
0.061,
0.055,
0.06,
0.053,
0.066,
0.052,
0.064,
0.051,
0.076,
0.069,
0.056,
0.057,
0.068,
0.07,
0.065,
0.062,
0.066,
0.063,
0.066,
0.054,
0.061,
0.061,
0.055,
0.053,
0.054,
0.065,
0.073,
0.064,
0.054,
0.065,
0.06,
0.059,
0.056,
0.064,
0.057,
0.06,
0.07,
0.063,
0.064,
0.067,
0.061,
0.053,
0.06,
0.064,
0.064,
0.057,
0.046,
0.057,
0.065,
0.074,
0.062,
0.063,
0.054,
0.074,
0.064,
0.077,
0.068,
0.06,
0.063,
0.059,
0.06,
0.068,
0.052,
0.064,
0.057,
0.059,
0.069,
0.061,
0.064,
0.047,
0.062,
0.069,
0.054,
0.069,
0.063,
0.077,
0.06,
0.061,
0.055,
0.069,
0.061,
0.06,
0.061,
0.067,
0.05,
0.061,
0.062,
0.081,
0.071,
0.057,
0.055,
0.054,
0.07,
0.068,
0.063,
0.056,
0.081,
0.049,
0.07,
0.048,
0.046,
0.069,
0.056,
0.066,
0.058,
0.058,
0.062,
0.052,
0.065,
0.043,
0.062,
0.063,
0.053,
0.073,
0.058,
0.064,
0.071,
0.073,
0.059,
0.08,
0.052,
0.053,
0.053,
0.053,
0.057,
0.061,
0.069,
0.046,
0.063,
0.078,
0.06,
0.06,
0.064,
0.063,
0.065,
0.069,
0.059,
0.068,
0.061,
0.066,
0.064,
0.064,
0.058,
0.046,
0.073,
0.06,
0.056,
0.073,
0.07,
0.058,
0.056,
0.064,
0.069,
0.065,
0.063,
0.063,
0.054,
0.081,
0.044,
0.048,
0.059,
0.058,
0.046,
0.063,
0.072,
0.063,
0.059,
0.063,
0.047,
0.063,
0.065,
0.071,
0.061,
0.05,
0.063,
0.065,
0.054,
0.053,
0.061,
0.054,
0.063,
0.056,
0.071,
0.057,
0.058,
0.049,
0.074,
0.057,
0.058,
0.07,
0.063,
0.057,
0.052,
0.064,
0.074,
0.047,
0.071,
0.051,
0.059,
0.05,
0.059,
0.05,
0.05,
0.057,
0.075,
0.053,
0.07,
0.062,
0.062,
0.075,
0.058,
0.057,
0.05,
0.062,
0.061,
0.067,
0.062,
0.059,
0.059,
0.049,
0.052,
0.062,
0.069,
0.062,
0.054,
0.05,
0.063,
0.052,
0.063,
0.069,
0.057,
0.067,
0.064,
0.057,
0.057,
0.057,
0.05,
0.062,
0.069,
0.075,
0.075,
0.05,
0.06,
0.065,
0.051,
0.063,
0.075,
0.06,
0.058,
0.063,
0.069,
0.055,
0.062,
0.06,
0.057,
0.079,
0.046,
0.059,
0.07,
0.055,
0.08,
0.048,
0.061,
0.042,
0.068,
0.082,
0.044,
0.054,
0.063,
0.054,
0.071,
0.053,
0.061,
0.06,
0.065,
0.072,
0.063,
0.062,
0.053,
0.072,
0.067,
0.058,
0.075,
0.07,
0.052,
0.056,
0.056,
0.082,
0.055,
0.056,
0.057,
0.056,
0.054,
0.073,
0.081,
0.063,
0.063,
0.054,
0.058,
0.062,
0.065,
0.063,
0.062,
0.056,
0.063,
0.06,
0.061,
0.068,
0.067,
0.07,
0.059,
0.06,
0.063,
0.057,
0.052,
0.062,
0.064,
0.065,
0.07,
0.063,
0.062,
0.052,
0.055,
0.055,
0.053,
0.057,
0.058,
0.062,
0.06,
0.056,
0.064,
0.074,
0.071,
0.059,
0.056,
0.063,
0.059,
0.058,
0.054,
0.058,
0.069,
0.06,
0.063,
0.054,
0.047,
0.061,
0.057,
0.059,
0.057,
0.063,
0.06,
0.071,
0.062,
0.06,
0.071,
0.059,
0.049,
0.077]
# Hypothesize that the proportion is 6%
late_prop_hyp = 0.06
# The bootstrap distribution above was precomputed; it could be regenerated with:
# late_shipments_boot_distn = []
# for i in range(5000):
#     boot_prop = (late_shipments.sample(frac=1, replace=True)['late'] == 'Yes').mean()
#     late_shipments_boot_distn.append(boot_prop)
# Calculate the standard error
std_error = np.std(late_shipments_boot_distn, ddof=1)
# Find z-score of late_prop_samp
z_score = (late_prop_samp - late_prop_hyp) / std_error
# Print z_score
print(z_score)
print("\nThe z-score is a standardized measure of the difference between the sample statistic and the hypothesized statistic")
0.13387997080083944
The z-score is a standardized measure of the difference between the sample statistic and the hypothesized statistic
Hypothesis tests check whether the sample statistic lies in the tails of the null distribution. The choice of tail depends on the alternative hypothesis:
alternative different from the null: two-tailed test
alternative greater than the null: right-tailed test
alternative less than the null: left-tailed test
A p-value is the probability of obtaining a result at least as extreme as the observed one, assuming the null hypothesis is true. A large p-value means our statistic is producing a result that is likely not in a tail of the null distribution, so chance is a plausible explanation for the result. A small p-value means our statistic is producing a result that is likely in a tail of the null distribution. Because p-values are probabilities, they are always between zero and one.
Calculating the p-value from a z-score:
left-tailed test: norm.cdf(z_score)
right-tailed test: 1 - norm.cdf(z_score)
The p-value determines the conclusion of the test:
large p-value => fail to reject the null hypothesis
small p-value => reject the null hypothesis
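Applying this to the late-shipments example: if the alternative hypothesis is that the proportion of late shipments is greater than 6% (a right-tailed test; the direction of the alternative is an assumption here), the p-value is the area to the right of the z-score under the standard normal distribution. A minimal sketch, reusing the z_score computed above:
from scipy.stats import norm

# Right-tailed test: area to the right of the z-score under the standard normal curve
p_value = 1 - norm.cdf(z_score)
print(p_value)

# Compare against a significance level, e.g. alpha = 0.05 (the choice of alpha is an assumption)
alpha = 0.05
print("reject H0" if p_value <= alpha else "fail to reject H0")
With the z-score of about 0.134 computed above, this p-value works out to roughly 0.45, far larger than typical significance levels, so we would fail to reject the null hypothesis that 6% of shipments are late.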
Type I and type II errors
For hypothesis tests, as for criminal trials, there are two states of truth and two possible outcomes, giving four combinations. Two of these combinations are correct test outcomes, and two are ways the test can go wrong.
The errors are known as false positives (or “type I errors”) and false negatives (or “type II errors”). Rejecting a true null hypothesis is a type I error; failing to reject a false null hypothesis is a type II error.